Please use quick-xml instead. This project was born as its successor, but quick-xml has since become active again.

High performance XML reader/writer.

Description

fast-xml contains two modes of operation:

A streaming API based on the StAX model. This is suited for larger XML documents which cannot be completely read into memory at once.

The user has to explicitly ask for the next XML event, similar to a database cursor. This is achieved by the following two structs:

  • Reader: A low-level XML pull-reader where buffer allocation/clearing is left to the user.
  • Writer: An XML writer. Can be nested with readers if you want to transform XML documents.

Especially for nested XML elements, the user must keep track of where (how deep) in the XML document the current event is located. This is needed as the
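
The depth bookkeeping described above can be sketched without the crate. The `Event` enum below is a hypothetical stand-in for the `Start`, `End`, `Empty` and `Text` variants of `fast_xml::events::Event` (the real variants carry borrowed byte payloads rather than `String`), and `text_depths` is an illustrative helper name, not part of the library.

```rust
/// Hypothetical stand-in for four variants of `fast_xml::events::Event`,
/// so that this sketch compiles with the standard library alone.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Start(String),
    End(String),
    Empty(String),
    Text(String),
}

/// Pairs every text event with the nesting depth at which it occurs.
/// Returns `None` if an `End` event appears without a matching `Start`.
pub fn text_depths(events: &[Event]) -> Option<Vec<(usize, String)>> {
    let mut depth = 0usize;
    let mut out = Vec::new();
    for event in events {
        match event {
            // An opening tag moves one level deeper.
            Event::Start(_) => depth += 1,
            // A closing tag moves one level back up; underflow means unbalanced input.
            Event::End(_) => depth = depth.checked_sub(1)?,
            // A self-closing tag opens and closes at once, so the depth is unchanged.
            Event::Empty(_) => {}
            // Record the text together with the current depth.
            Event::Text(text) => out.push((depth, text.clone())),
        }
    }
    Some(out)
}
```

With a real `Reader`, the same counter would be updated inside the `read_event` loop: increment on `Event::Start`, decrement on `Event::End`, and leave it alone for `Event::Empty`.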

Furthermore, fast-xml contains optional Serde support to serialize structs to XML and deserialize XML into structs directly, without having to deal with the XML events.

Examples

Reader

use fast_xml::Reader;
use fast_xml::events::Event;

let xml = r#"<tag1 att1 = "test">
                <tag2><!--Test comment-->Test</tag2>
                <tag2>
                    Test 2
                </tag2>
            </tag1>"#;

let mut reader = Reader::from_str(xml);
reader.trim_text(true);

let mut count = 0;
let mut txt = Vec::new();
let mut buf = Vec::new();

// The `Reader` does not implement `Iterator` because it outputs borrowed data (`Cow`s)
loop {
    match reader.read_event(&mut buf) {
    // for triggering namespaced events, use this instead:
    // match reader.read_namespaced_event(&mut buf) {
        Ok(Event::Start(ref e)) => {
        // for namespaced:
        // Ok((ref namespace_value, Event::Start(ref e)))
            match e.name() {
                b"tag1" => println!("attributes values: {:?}",
                                    e.attributes().map(|a| a.unwrap().value)
                                    .collect::<Vec<_>>()),
                b"tag2" => count += 1,
                _ => (),
            }
        },
        // unescape and decode the text event using the reader encoding
        Ok(Event::Text(e)) => txt.push(e.unescape_and_decode(&reader).unwrap()),
        Ok(Event::Eof) => break, // exits the loop when reaching end of file
        Err(e) => panic!("Error at position {}: {:?}", reader.buffer_position(), e),
        _ => (), // There are several other `Event`s we do not consider here
    }

    // if we don't keep a borrow elsewhere, we can clear the buffer to keep memory usage low
    buf.clear();
}

Writer

use fast_xml::Writer;
use fast_xml::events::{Event, BytesEnd, BytesStart};
use fast_xml::Reader;
use std::io::Cursor;
use std::iter;

let xml = r#"<this_tag k1="v1" k2="v2"><child>text</child></this_tag>"#;
let mut reader = Reader::from_str(xml);
reader.trim_text(true);
let mut writer = Writer::new(Cursor::new(Vec::new()));
let mut buf = Vec::new();
loop {
    match reader.read_event(&mut buf) {
        Ok(Event::Start(ref e)) if e.name() == b"this_tag" => {

            // creates a new element ... alternatively we could reuse `e` by calling
            // `e.into_owned()`
            let mut elem = BytesStart::owned(b"my_elem".to_vec(), "my_elem".len());

            // collect existing attributes
            elem.extend_attributes(e.attributes().map(|attr| attr.unwrap()));

            // add a new my-key="some value" attribute after the copied ones
            elem.push_attribute(("my-key", "some value"));

            // writes the event to the writer
            assert!(writer.write_event(Event::Start(elem)).is_ok());
        },
        Ok(Event::End(ref e)) if e.name() == b"this_tag" => {
            assert!(writer.write_event(Event::End(BytesEnd::borrowed(b"my_elem"))).is_ok());
        },
        Ok(Event::Eof) => break,
        Ok(e) => assert!(writer.write_event(e).is_ok()),
        // or using the buffer
        // Ok(e) => assert!(writer.write(&buf).is_ok()),
        Err(e) => panic!("Error at position {}: {:?}", reader.buffer_position(), e),
    }
    buf.clear();
}

let result = writer.into_inner().into_inner();
let expected = r#"<my_elem k1="v1" k2="v2" my-key="some value"><child>text</child></my_elem>"#;
assert_eq!(result, expected.as_bytes());

Features

fast-xml supports the following features:

  • encoding: Enables support for non-UTF-8 encoded documents. The encoding is inferred from the XML declaration if one is found; otherwise UTF-8 is assumed.

    Currently, only ASCII-compatible encodings are supported; UTF-16, for example, will not work (so fast-xml is not standard-compliant).

    The supported encodings are all those supported by the encoding_rs crate that satisfy the restriction above.

  • serialize: Enables support for serde serialization and deserialization.

  • escape-html: Enables support for recognizing all HTML 5 entities.
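
As a sketch of how these features might be switched on, the Cargo.toml fragment below enables two of them. The feature names come from the list above; the version number is an assumption and should be replaced with the release you actually depend on.

```toml
[dependencies]
# Version is an assumption; pick the release you actually use.
fast-xml = { version = "0.23", features = ["encoding", "serialize"] }
```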

Modules

Serde Deserializer module

Manage XML character escapes

Defines zero-copy XML events used throughout this library.

Module to handle custom serde Serializer

Structs

A struct to write an element. Contains methods to add attributes and inner elements to the element.

A low level encoding-agnostic XML event reader.

XML writer.

Enums

(De)serialization error

The error type used by this crate.

Type Definitions

A specialized Result type where the error is hard-wired to Error.
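
The "specialized Result" pattern can be illustrated with the standard library alone. In this minimal sketch the `Error` enum, its single variant, and the `first_byte` helper are hypothetical stand-ins chosen for illustration; they are not the crate's actual definitions.

```rust
use std::fmt;

/// Hypothetical error type standing in for the crate's `Error` enum.
#[derive(Debug, PartialEq)]
pub enum Error {
    UnexpectedEof(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::UnexpectedEof(what) => write!(f, "unexpected end of input: {}", what),
        }
    }
}

impl std::error::Error for Error {}

/// Specialized alias: signatures can say `Result<T>` instead of
/// `std::result::Result<T, Error>`.
pub type Result<T> = std::result::Result<T, Error>;

/// Hypothetical helper that uses the alias in its signature.
pub fn first_byte(input: &[u8]) -> Result<u8> {
    input
        .first()
        .copied()
        .ok_or_else(|| Error::UnexpectedEof("empty input".to_string()))
}
```

The benefit of such an alias is purely ergonomic: every fallible function in a crate shares one error type, so the error parameter does not need to be repeated in each signature.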